Combining Unsupervised Variable Selection with Dimensionality Reduction
Abstract
This paper bridges the gap between variable selection methods (e.g., Pearson coefficients, the KS test) and dimensionality reduction algorithms (e.g., PCA, LDA). Variable selection algorithms struggle with highly correlated data, because many features are of similar quality. Dimensionality reduction algorithms tend to combine all variables and cannot pick out the significant variables from a set of features. Our approach combines both methodologies by applying variable selection followed by dimensionality reduction. The key point is to optimize the same utility function in both stages. The resulting algorithm benefits from complex features, as variable selection algorithms do, while retaining the benefits of dimensionality reduction.
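The two-stage idea can be illustrated with a minimal sketch. The abstract does not name the utility function, so this example assumes variance as the shared criterion: variables are ranked by their variance, and PCA (which also maximizes variance) is then applied to the selected subset. The function name `select_then_reduce` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def select_then_reduce(X, n_select, n_components):
    """Hypothetical sketch: variance-based selection followed by PCA.

    Variance is an ASSUMED shared utility; the paper's own utility
    function is not stated in the abstract.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each variable

    # Stage 1: variable selection. Score each variable by its variance
    # and keep the n_select highest-scoring columns (indices sorted).
    scores = Xc.var(axis=0)
    selected = np.sort(np.argsort(scores)[::-1][:n_select])

    # Stage 2: dimensionality reduction. PCA via SVD on the selected
    # columns; the leading right singular vectors maximize variance.
    _, _, Vt = np.linalg.svd(Xc[:, selected], full_matrices=False)
    components = Vt[:n_components]
    Z = Xc[:, selected] @ components.T  # projected data
    return selected, components, Z

# Synthetic data: three highly correlated, high-variance variables
# followed by five low-variance noise variables.
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 1))
informative = 3.0 * signal @ np.ones((1, 3)) + 0.1 * rng.normal(size=(200, 3))
noise = 0.1 * rng.normal(size=(200, 5))
X = np.hstack([informative, noise])

selected, components, Z = select_then_reduce(X, n_select=3, n_components=1)
```

In this toy setting, the selection stage discards the noise variables and the reduction stage merges the three correlated variables into a single component, which is the kind of division of labor the abstract describes. Other utilities could be substituted as long as both stages optimize the same one.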
Related resources
Exploring the Gap Between Variable Selection and Dimensionality Reduction
The Problem: This project addresses the gap between variable selection algorithms and dimensionality reduction algorithms. Variable selection algorithms are designed to produce sparse solutions in which only a few variables are marked as relevant. This is not suitable for highly correlated data such as the gray values of an image. Dimensionality reduction algorithms (e.g., PCA) tend to combine al...
Relevance Analysis of Stochastic Biosignals for Identification of Pathologies
This paper presents a complementary study of the methodology for diagnosing pathologies, based on relevance analysis of stochastic (time-variant) features that are extracted from t-f representations of biosignal recordings. Dimension reduction is carried out by adapting in time commonly used latent variable techniques for a given relevance function, as evaluation measure of time-variant tran...
A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters
Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...
Probabilistic Additive Component Analysis: A Latent Variable Model for Dimensionality Reduction of Human Functional Magnetic Resonance Images
In recent years, an important new application of machine learning research has emerged from the field of cognitive neuroscience. In ‘mind-reading’ experiments, a machine learning classifier is trained to predict aspects of a human subject’s mental state from patterns of brain activity recorded by a functional MRI (fMRI) scanner. However, a typical fMRI dataset consists of relatively few, noi...
Steel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps
Steel consumption is a critical factor affecting pricing decisions and a key element in achieving sustainable industrial development. The main purpose of this paper is to forecast future trends in steel consumption by analyzing nonlinear patterns with artificial intelligence (AI) techniques. Because there are several features affecting the target variable which make the analysis of relations...